
Research

DCA-Bench: A Benchmark for Dataset Curation Agents
Benhao Huang, Yingzhuo Yu, Jin Huang, Xingjian Zhang, Jiaqi W. Ma
#LLM Agent
#Benchmark

A benchmark that evaluates how well LLM agents detect issues in datasets hosted on popular platforms. (Under review)

paper
code